VIT – Venice Italian Treebank: Syntactic and Quantitative Features

Authors

  • Rodolfo Delmonte
  • Antonella Bristot
  • Sara Tonelli
Abstract

In this paper we describe VIT (Venice Italian Treebank), created at the University of Venice. We focus on the syntactic-semantic features of our treebank and on a quantitative analysis of its data, comparing them with other treebanks. More generally, we try to substantiate the claim that treebanking grammars or parsers is strongly dependent on the chosen treebank, and that this process in turn seems to depend either on substantial factors such as the adopted linguistic framework for structural description or, ultimately, on the described language.

Similar resources

Enriching the Venice Italian Treebank with Dependency and Grammatical Relations

In this paper we propose a rule-based approach to extract dependency and grammatical relations from the Venice Italian Treebank (VIT) (Delmonte et al., 2007) with bracketed tree structure. To our knowledge, the only dependency annotated corpus for Italian available is the Turin University Treebank (Lesmo et al., 2002), which has 25,000 tokens and is about 1/10 of VIT. As manual corpus ...


Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...


Italian Treebank lexico-semantic annotation and reference lexical resource

The paper reports on the lexico-semantic annotation level of the Italian Treebank, the first Italian corpus with a multi-level annotation (morpho-syntactic, syntactic and lexico-semantic). The strategy of annotation and the reference lexical resource are described, and the results achieved too.


A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three approaches have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on Persian if they are trained with good features. To get good performanc...


Grammatical relation’s system in treebank annotation

The paper presents theoretical aspects and practical issues related to the development of a grammatical relation’s system for corpus annotation. The grammatical relations are arranged on a default inheritance hierarchy based on syntactic and semantic features. Preliminary tests on the annotation of an Italian treebank (the Turin University Treebank) show that the system implements a reasonable ...


Publication date: 2007